Low leakage asymmetric SRAM cell devices

ABSTRACT

Asymmetric SRAM cell designs exploiting data storage patterns found in ordinary software programs wherein most of the bits stored are zeroes for data and instruction streams. The asymmetric SRAM cell designs offer lower leakage power with little impact on latency. In asymmetric SRAM cells, selected transistors are “weakened” to reduce leakage current when the cell is storing a zero. Transistor weakening may be achieved by using higher voltage threshold transistors, by varying transistor geometries, or other means. In addition, a novel sense amplifier design is provided that leverages the asymmetric nature of the asymmetric SRAM cells to offer cell read times that are comparable with conventional symmetric SRAM cells. Lastly, cache memory designs are provided that are based on asymmetric SRAM cells offering leakage power reduction while maintaining high performance, comparable noise margins, and stability with respect to conventional cache memories.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of previously filed U.S. Provisional Patent Application Ser. No. 60/402,275 filed on Aug. 9, 2002 entitled, “LOW LEAKAGE ASYMMETRIC SRAM CELL, ASSOCIATED NOVEL SENSE AMP, ASSOCIATED SRAM AND CACHE CELL STRUCTURES, AND RELATED METHODS”.

FIELD OF THE INVENTION

The present invention relates generally to SRAM (Static Random Access Memory) devices, and more particularly to low leakage power SRAM devices having device performance comparable to conventional SRAM devices.

BACKGROUND

As a result of technology trends and the increased importance of portable electronic devices, leakage (static) power dissipation has emerged as a high priority design consideration in high-performance processor design. Historically, architectural innovations for improving performance relied on exploiting ever larger numbers of transistors operating at higher frequencies. To keep the higher resulting switching power dissipation at bay, successive technology generations have relied on reducing the supply voltage. In order to maintain performance, however, this has required a corresponding reduction in the transistor threshold voltage. Since the Metal Oxide Semiconductor Field Effect Transistor (MOSFET) sub-threshold leakage current increases exponentially with a reduced threshold voltage, leakage power dissipation has grown to be a significant fraction of overall chip power dissipation in modern, deep-submicron (<0.18 μm) processes. Moreover, it is expected to grow by a factor of five with each new chip generation. For processors it is estimated that in 0.1 μm technology, leakage power will account for about 50% of the total chip power.

Since leakage power is proportional to the number of transistors, and given the projected large memory content of future System-on-Chip (SOC) devices, it becomes important to focus on Static Random Access Memory (SRAM) structures such as caches, which comprise the vast majority of on-chip transistors in some systems. Existing circuit-level leakage reduction techniques are oblivious to program behavior, such as how many bits to be stored will be high or low, and trade off performance for reduced leakage where possible. Combined circuit and architecture-level techniques reduce leakage for those parts of the on-chip caches that remain unused for long periods of time (for example, for thousands of cycles). The mechanisms that identify which cache parts will be unused and that enable leakage reduction incur considerable power and performance overheads that have to be amortized over long periods of time. As a result, these methods are not effective when most of the cache is actively used.

There is a need for SRAM storage with reduced leakage power while having comparable performance characteristics. As such, power consumption may be minimized while still providing the performance required in new generation systems and consumer devices.

SUMMARY

The present invention seeks to satisfy at least some of the above unmet needs. Embodiments of the present invention include a family of improved asymmetric SRAM cell designs that can be used in new SRAM and cache memory designs referred to as the Asymmetric-Cell Caches (ACC). ACCs offer drastically reduced leakage power compared to conventional caches even when there are few parts of the cache that are left unused. ACCs exploit the fact that in ordinary programs most of the bits in caches are zeroes for both the data and instruction streams. It has been shown that this behavior persists for a variety of programs under different assumptions about cache sizes, organization and instruction set architectures, even when assuming perfect knowledge of which cache parts will be left unused for long periods of time.
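The zero-bias observation can be checked on any memory image by counting bits. The following Python sketch is illustrative only: the function name `zero_bit_fraction` and the sample values are assumptions, not data from the measurements referred to above. It packs a few small integers as 32-bit words, much as a program might hold counters or indices, and reports the fraction of stored bits that are zero.

```python
import struct

def zero_bit_fraction(data: bytes) -> float:
    """Return the fraction of bits in `data` that are zero."""
    if not data:
        raise ValueError("data must not be empty")
    total_bits = 8 * len(data)
    one_bits = sum(bin(byte).count("1") for byte in data)
    return (total_bits - one_bits) / total_bits

# Hypothetical sample: small values stored in 32-bit little-endian words.
sample_words = [0, 1, 2, 3, 100, 255, 1024, 65535]
image = struct.pack("<8I", *sample_words)

print(f"zero bits: {zero_bit_fraction(image):.3f}")
```

Small magnitudes leave the upper bits of each word at zero, which is one intuitive source of the bias the asymmetric cells rely on.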

Conventional SRAM cells are symmetrically composed of transistors with comparable leakage and threshold characteristics. The asymmetric SRAM cell designs of the present invention offer low leakage with little or no impact on latency. In asymmetric SRAM cells, selected transistors are “weakened” with respect to other transistors used in SRAM cells to reduce leakage power when the cell is storing a zero binary state (the most common case). Transistor weakening may be achieved by using higher voltage threshold (Vt) transistors, by varying transistor sizes, combinations of these approaches, or other means.

In addition to improved SRAM designs, the present invention also describes a novel sense amplifier (SA) design that exploits the asymmetric nature of the asymmetric cells to offer cell read times that are comparable with conventional symmetric SRAM cells. Moreover, an embodiment of the present invention further presents a cache memory design based on ACCs that, when compared to a conventional cache, offers leakage reduction while maintaining high performance and comparable noise margins and stability.

In one embodiment of the present invention there is disclosed an asymmetric SRAM cell for storing a binary variable. The asymmetric SRAM cell exhibits reduced leakage power with respect to a comparable symmetric SRAM cell when the asymmetric SRAM cell stores a binary variable representing a predetermined binary value, such as a binary one or binary zero. The asymmetric SRAM cell is made up of a plurality of transistors of a first and second type operably coupled and configured as an asymmetric SRAM cell. At least one of the second type of transistor is made weaker than at least one of the first type of transistor. The two types of transistors are then variously configured such that the asymmetric SRAM cell achieves reduced leakage power with respect to a symmetric SRAM cell having the first type of transistor only.

The second type of transistor can be made weaker than the first type of transistor in various ways. One way is to increase the voltage threshold as compared to the voltage threshold of the first type of transistor. Another way is to decrease the channel width as compared to the channel width of the first type of transistor. Yet another way is to increase the channel length as compared to the channel length of the first type of transistor. Further, combinations of the above ways to make transistors relatively weaker, as well as other ways to make transistors relatively weaker, may be used.

In another embodiment of the present invention there is disclosed a sense amplifier (SA) that exploits the characteristics of the asymmetric SRAM cell. A sense amplifier is coupled with an asymmetric SRAM cell and provides faster access times when the asymmetric SRAM cell stores a first predetermined binary value. The sense amplifier is comprised of a first pair of cross coupled inverters across a bitline (BL) and a bitline bar (BLB) and a second pair of cross coupled inverters operably coupled with the first pair of cross coupled inverters. This is conventional up to this point. The present invention sense amplifier further includes a plurality of additional transistors forming a dummy column of cells that store a second predetermined binary value at all times wherein during a read operation of the SRAM cell one of the dummy cells will have its wordline asserted. The dummy column of cells is operably coupled with the first pair of cross coupled inverters. The sense amplifier is driven by four inputs operably coupled with a subset of transistors. The inputs include the BL and BLB that derive from the SRAM cell, as well as a dummy bitline (D), and a dummy bitline bar (DB). The D and DB are input to the dummy cells such that D is input to the sense amplifier on the same side as BLB while DB is input to the sense amplifier on the same side as BL.

Moreover, the transistors coupled with BL and BLB have higher transconductance characteristics than the transistors coupled with D and DB. This is achieved either by varying the threshold voltage or altering the size of the transistor channel widths or channel lengths.

In yet another embodiment of the present invention there is disclosed an SRAM device comprised of an array of SRAM cells wherein each SRAM cell stores a binary variable representing a predetermined binary value. In addition, each SRAM cell is an asymmetric SRAM cell having reduced leakage power with respect to a comparable symmetric SRAM cell as previously described. The SRAM device can be configured as a direct store SRAM device, a selectively inverted SRAM device, or a cache memory device. If the SRAM device is a cache memory device then it can either be configured as a direct store cache memory or a selectively inverted cache memory.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an example circuit diagram of a conventional six transistor SRAM cell.

FIG. 2 illustrates a circuit diagram of a basic asymmetric SRAM cell, according to one embodiment of the present invention.

FIG. 3 illustrates a circuit diagram of an asymmetric SRAM cell configured to address leakage, according to one embodiment of the present invention.

FIG. 4 illustrates a circuit diagram of an asymmetric SRAM cell configured to address leakage, according to one embodiment of the present invention.

FIG. 5 illustrates a circuit diagram of an asymmetric SRAM cell configured to address leakage and speed, according to one embodiment of the present invention.

FIG. 6 illustrates a circuit diagram of an asymmetric SRAM cell configured to address leakage and speed, according to one embodiment of the present invention.

FIG. 7 illustrates a circuit diagram of an asymmetric SRAM cell configured to address leakage and speed, according to one embodiment of the present invention.

FIG. 8 illustrates a circuit diagram of an asymmetric SRAM cell termed a special precharge cell, according to one embodiment of the present invention.

FIG. 9 illustrates a circuit diagram of an asymmetric SRAM cell termed a stability leakage enhanced cell, according to one embodiment of the present invention.

FIG. 10 illustrates a circuit diagram of an asymmetric SRAM cell termed a stability speed enhanced cell, according to one embodiment of the present invention.

FIG. 11 illustrates a circuit diagram of an asymmetric SRAM cell configured to address leakage through differences in transistor sizing, according to one embodiment of the present invention.

FIG. 12 illustrates a circuit diagram of an asymmetric SRAM cell configured to address leakage and speed through differences in transistor sizing, according to one embodiment of the present invention.

FIG. 13 illustrates a conventional sense amplifier.

FIG. 14 illustrates a sense amplifier, according to one embodiment of the present invention.

FIG. 15 illustrates a data flow diagram showing the use of selective inversion of byte data to optimize use of asymmetric SRAM cells, according to one embodiment of the present invention.

DETAILED DESCRIPTION

Ideally, an SRAM cell should be fast and should dissipate low leakage power. This is increasingly at odds with the fundamental technology trade off between transistor speed and leakage. Conventional high performance SRAM cells use a symmetric configuration of six transistors with comparable threshold voltages. One can reduce leakage by using higher Vt transistors, but unfortunately using an all high Vt transistor cell degrades performance by an unacceptable margin.

The goal of the asymmetric SRAM cells of the present invention is to reduce leakage while maintaining high performance based on the following approach: select a preferred state and weaken only those transistors necessary to drastically reduce leakage when the cell is in that state. These cells exhibit asymmetric leakage and access behavior. Fortunately, their asymmetric access behavior can be exploited to maintain high performance while reducing leakage.

For purposes of illustration, the following convention will be used. A high Vt (HV) transistor is obtained from a basic 0.13 μm, 1.2V transistor (referred to herein as the regular Vt (RV) transistor) by artificially increasing the Vt by 0.2V. The value 0.2V was chosen because it leads to a difference of about 10 times between the leakage currents of HV and RV transistors, which is typical of dual Vt technology. Those of ordinary skill in the art will realize that other relative changes to Vt can be implemented. The data values illustrated herein are but one example chosen to illustrate the results of the present invention when the asymmetric concept is applied. For illustration purposes, a “high Vt transistor” as used herein is defined as a transistor having a relatively higher “Vt” or threshold voltage than other transistors typically used in an SRAM cell design. The reason for selecting transistors having a higher Vt than others within the SRAM cell is to reduce the leakage current and thereby reduce an SRAM cell's leakage power. Although the high Vt transistor example described herein has a threshold voltage (Vt) which is 0.2 volts higher, this is only an example for a 1.2 volt, basic, 0.13 micron transistor. Different shifts of Vt could be used, using either higher or lower Vt differential voltages, so long as the leakage current draw is reduced as required for a given SRAM cell design or application. Additionally, transistors in technologies other than the basic 0.13 micron example can be used.

Moreover, the present invention has been described and illustrated using MOSFET type transistors. Those of ordinary skill in the art can appreciate that other types of transistors and the like can be substituted for MOSFETs.

FIG. 1 illustrates a conventional SRAM cell 10 comprised of two inverters 12, 14, (P2, N2) and (P1, N1), and two pass transistors 16, 18, N3 and N4. In the inactive state, a wordline (WL) 20 is held low so that the two pass transistors 16, 18 are off, isolating the cell from a bitline (BL) 22 and bitline-bar (BLB) 24. At this stage the bitlines 22, 24 are also typically charged at V_(DD) (e.g., logic ‘1’). Cells spend most of their time in the inactive state. In this state, most of the leakage is dissipated by the transistors that are off and that have a voltage differential across their drain and source. The value stored in the cell (i.e., the cell state) determines which transistors these are. When the cell is storing a ‘0’, as in FIG. 1, the leaky transistors are P1, N4 and N2. If the cell were storing a ‘1’ then transistors P2, N1 and N3 would dissipate leakage power. A simple technique for reducing leakage power would be to replace all transistors with high-Vt ones, but this unacceptably degrades the bitline discharge times by 61.6%.

Since ordinary programs exhibit a strong bias in cache-resident bit values, another possibility to reduce leakage power, but at the same time keep read access times short, is to choose a preferred stored value and to only replace those transistors that contribute to the leakage power in this state with HV transistors. This is illustrated in FIG. 2 where P1, N4 and N2 have been made weaker with respect to P2, N1, and N3. This basic asymmetric SRAM cell 25 was simulated and exhibits the same leakage as the RV cell 10 of FIG. 1 when holding a logic ‘1’, but its leakage is reduced by 70× when holding a logic ‘0.’
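The selection rule just described, weakening exactly the transistors that leak in the preferred state, can be expressed as a small set computation. The Python sketch below is only an illustration: the dictionary and function names are assumptions, while the transistor lists are the ones given for FIG. 1 and FIG. 2.

```python
# Transistors that are off with a drain-source voltage across them,
# per stored value, as listed for the conventional cell of FIG. 1.
LEAKY_WHEN_STORING = {
    0: {"P1", "N4", "N2"},
    1: {"P2", "N1", "N3"},
}

# High-Vt (weakened) transistors per cell; all others are regular Vt.
HIGH_VT = {
    "RV": set(),                      # conventional cell, FIG. 1
    "basic": {"P1", "N4", "N2"},      # basic asymmetric cell, FIG. 2
}

def regular_vt_leakers(cell: str, stored: int) -> set:
    """Leaky transistors that are still regular Vt for this cell and state."""
    return LEAKY_WHEN_STORING[stored] - HIGH_VT[cell]

for cell in HIGH_VT:
    for stored in (0, 1):
        leakers = sorted(regular_vt_leakers(cell, stored))
        print(f"{cell:5s} storing {stored}: regular-Vt leakers = {leakers}")
```

For the basic asymmetric cell the set is empty when a ‘0’ is stored and unchanged when a ‘1’ is stored, which matches the qualitative result stated above: a large reduction in the preferred state and none in the other.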

The read access time of the basic asymmetric cell is, however, degraded. Due to N2's and N4's higher threshold voltage, the bitline discharge takes longer. The discharge times for BLB and BL are 12.2% and 46.4% longer than the discharge time for the RV cell, respectively. Discharge time is defined as the time from when the wordline is raised to when one of the bitlines falls to 90% of its precharge value. The 90% level was chosen because it provides an appropriate differential signal for sense amplifiers to trigger.
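The discharge-time definition above can be applied to a sampled bitline waveform as follows. This is a minimal sketch under stated assumptions: the function name, its parameters, the use of linear interpolation between samples, and the example waveform are all hypothetical and are not taken from the simulations reported here.

```python
from typing import Optional, Sequence

def discharge_time(times: Sequence[float],
                   bitline_v: Sequence[float],
                   t_wordline: float,
                   v_precharge: float,
                   fraction: float = 0.9) -> Optional[float]:
    """Time from wordline assertion until the bitline first falls to
    `fraction` of its precharge value, or None if it never does.

    Uses linear interpolation between adjacent samples.
    """
    threshold = fraction * v_precharge
    for i in range(1, len(times)):
        if times[i] < t_wordline:
            continue
        v_prev, v_curr = bitline_v[i - 1], bitline_v[i]
        if v_prev > threshold >= v_curr:
            # Interpolate the crossing instant inside this sample interval.
            frac = (v_prev - threshold) / (v_prev - v_curr)
            t_cross = times[i - 1] + frac * (times[i] - times[i - 1])
            return t_cross - t_wordline
    return None

# Hypothetical waveform: time in ns, bitline precharged to 1.2 V,
# wordline raised at t = 1.0 ns.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
v = [1.2, 1.2, 1.14, 1.02, 0.9]
print(discharge_time(t, v, t_wordline=1.0, v_precharge=1.2))
```

Comparing the value returned for an asymmetric cell with that of the RV cell gives the percentage degradations quoted in this description.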

P-Channel Metal Oxide Semiconductor (PMOS) transistors have very little effect on a cell's read access time because the role of pulling down the bitlines is played by the two n-channel Metal Oxide Semiconductor (NMOS) transistors on the side of the cell storing the ‘0’. Thus, a better asymmetric cell can be configured using the basic asymmetric cell of FIG. 2 with P2 also set to high Vt. This cell, shown in FIG. 3, is referred to as the Leakage Improved 2 (LI2) cell 30 and has the advantage of partially reduced leakage in the high leakage state. When the cell is holding a logic ‘1’ its leakage is reduced by 1.6× relative to the RV cell, and when holding a logic ‘0’ its leakage is reduced by 70×. The discharge times for BLB and BL are 12.2% and 46.4% longer than the discharge times for the RV cell, respectively, the same as the basic asymmetric cell's discharge times.

A further improvement is possible since by using a sense amplifier (described below) that matches the read time on the slow side of the cell to the fast side, there is no need for N1 to be low Vt. This leads to the cell in FIG. 4, referred to as the Leakage Improved 3 (LI3) cell 40 or leakage enhanced (LE) cell 40. This cell further reduces leakage in the high leakage state, so that its leakage relative to the RV cell 10 is reduced by 7× in the ‘1’ state and by 70× in the ‘0’ state. The BL discharge time is now 61.6% longer than the discharge time for the RV cell 10, but that is of minor importance due to the novel sense amplifier design, as we will see later. The two asymmetric cells, LI2 30 and LI3 40, take the basic asymmetric cell 25 of FIG. 2 and improve its leakage performance without affecting its read access time.

Another design challenge is to take the basic asymmetric cell 25 and improve its read access time while keeping some of the leakage benefits of the basic asymmetric cell 25. To eliminate the speed penalty incurred in the basic asymmetric cell 25 due to both pull-down paths having one high Vt transistor, both N2 and N3 are kept at low Vt while P1 is made high Vt. This cell is shown in FIG. 5 and is termed the Speed Improved 1 (SI1) cell 50. The SI1 cell 50 has discharge times for BLB and BL which are 0% and 46.7% longer, respectively, than those of the RV cell 10. Thus one side of the cell is just as fast as the RV cell 10. However, this cell suffers from higher leakage than the basic asymmetric cell 25, with a leakage reduction of 2× relative to RV cell 10 when holding a ‘0’, and no leakage reduction when holding a ‘1’.

The same transformations performed on the basic asymmetric cell 25 to improve its leakage performance can also be performed on the SI1 cell 50. First, P2 is made high Vt (FIG. 6), and then N1 is also made high Vt (FIG. 7). These two new cells are named Speed Improved 2 (SI2) 60 and Speed Improved 3 (SI3) 70, respectively. The SI2 cell 60 has leakage reductions of 2× and 1.6× when storing a ‘0’ and ‘1’, respectively, while the SI3 cell 70 has leakage reductions of 2× and 7×. The SI3 cell 70 is also referred to as the Speed Enhanced (SE) cell 70.

These two cells have no read access time degradation compared to the RV cell 10 along BLB, but have a 46.5% and 61.6% degradation along BL, respectively. Once again, the degradation along BL is of minor importance due to the novel sense amplifier.

Note that the SE cell 70 reverses the preferred leakage state to the state when the cell is holding a ‘1’. All further references to this cell will have the ‘1’ state as the preferred state so that the cell language remains in conformity with other cells. It should be noted that in practice the cell bitlines can be flipped to allow for ‘0’ to be the preferred state without affecting any of the performance or stability results shown here.

One would like to combine the low leakage of the LI2 30 and LE 40 cells with a very small read access delay. Yet another asymmetric cell addresses these objectives, but it requires a different read operation. In the steady state, instead of keeping BL precharged to V_(DD), it is kept at ground. Now, N4 18 can be kept low Vt for the preferred ‘0’ state. This is termed the Special Precharge (SP) cell 80 and it is shown in FIG. 8. This asymmetric cell requires changes to the peripheral circuits of the SRAM array. Nevertheless, the results for this cell indicate that leakage is reduced by 83.3× in the ‘0’ state, while the ‘1’ state shows no leakage reduction. Bitline discharge times are degraded by 12.2% and 0%, respectively, for this example.

Until now, only the bitline discharge times of the different cells have been compared, and write times have been ignored. The write times of the cells are less important because stronger write drivers can be designed to drive the bitlines, and write drivers are a small portion of the total SRAM. The write times of the asymmetric cells all lie within the write times of the RV cell and the HV cell.

The LE cell 40 and SE cell 70 are the two best designs from the two sets of asymmetric cells as indicated by test results. Therefore, only these two cells, and variations on them, will be referenced in the remainder of this description.

Another major consideration with the cell design is its stability. There are two interrelated issues: read stability and noise margins. Read stability indicates how likely it is to invert the cell's stored value when it is being accessed. This is computed as the ratio of I_(trip)/I_(read), where I_(trip) is the current through the pull-down NMOS when the state of the cell is being reversed by injecting an external current I_(test), and where I_(read) is the maximum current through the pass transistor during a read.

The static noise margin (SNM) of an SRAM cell is defined as the minimum DC noise voltage necessary to flip the state of the cell. For the present invention, the stability of all cells was measured by simulation via both the SNM and the I_(trip)/I_(read) methods. Under both stability tests, the stability was first measured under nominal conditions, assuming no process variations. Then, to measure stability under process variations, two sets of tests were performed. First, the SNM and I_(trip)/I_(read) tests were performed on 59,049 combinations of different Vt and length variations for all six transistors in the cell. The combinations included modifying by {−3σ, 0, 3σ} the NMOS transistors' Vt and length values and the PMOS transistors' Vt value. The worst case value for various cells was found, and compared to the worst-case value obtained for the RV cell.
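The count of 59,049 equals 3^10, which is consistent with ten independently varied parameters at three levels each: Vt and channel length for the four NMOS transistors, plus Vt for the two PMOS transistors. The Python sketch below enumerates corners under that reading. The reading itself, the parameter labels, and the `corners` helper are assumptions made for illustration; the simulation that would be run at each corner is not shown.

```python
from itertools import product

# Ten varied parameters: Vt and channel length for each of the four NMOS
# transistors, and Vt only for each of the two PMOS transistors.
NMOS = ("N1", "N2", "N3", "N4")
PMOS = ("P1", "P2")
PARAMETERS = ([f"{t}.Vt" for t in NMOS]
              + [f"{t}.L" for t in NMOS]
              + [f"{t}.Vt" for t in PMOS])

SIGMA_LEVELS = (-3, 0, 3)  # deviation from nominal, in units of sigma

def corners():
    """Yield each process corner as a dict mapping parameter -> sigma level."""
    for levels in product(SIGMA_LEVELS, repeat=len(PARAMETERS)):
        yield dict(zip(PARAMETERS, levels))

count = sum(1 for _ in corners())
print(count)  # 3**10 = 59049
```

A worst-case search would evaluate the SNM or I_(trip)/I_(read) at every yielded corner and keep the minimum.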

Second, Monte-Carlo analysis was performed to obtain a distribution for the SNM and I_(trip)/I_(read). For each cell, 500 scenarios for Vt and channel length were randomly generated, consistent with their joint distributions, and simulated. The mean of the distribution was estimated using the unbiased estimator in (1), and the variance was estimated by using the unbiased estimator in (2). Furthermore, the Normal Scores Method was used to graphically determine the distribution type. Given the distribution type, mean, and variance, the probability of failure for various cells was then computed.
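Equations (1) and (2) are not reproduced in this text. The Python sketch below assumes they are the standard sample mean and the unbiased sample variance (with an n−1 divisor), and then computes a failure probability from a fitted Gaussian. The function name, the failure threshold, and the synthetic samples are hypothetical stand-ins for simulator output.

```python
import random
from statistics import NormalDist, mean, variance

def failure_probability(samples, fail_below):
    """Fit a Gaussian to `samples` and return (mean, variance, P(x < fail_below)).

    mean()     is the sample mean, (1/n) * sum(x_i)
    variance() is the unbiased sample variance,
               (1/(n-1)) * sum((x_i - mean)**2)
    """
    mu = mean(samples)
    var = variance(samples, mu)
    return mu, var, NormalDist(mu, var ** 0.5).cdf(fail_below)

# Hypothetical stand-in for 500 simulated SNM values (volts); a real flow
# would obtain each value from a circuit simulation of one random scenario.
rng = random.Random(0)
snm_samples = [rng.gauss(0.25, 0.01) for _ in range(500)]

mu, var, p_fail = failure_probability(snm_samples, fail_below=0.20)
print(f"mean={mu:.4f} V  variance={var:.2e} V^2  P(fail)={p_fail:.2e}")
```

The Gaussian fit is justified only when a normality check, such as the Normal Scores Method mentioned above, supports it.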

The SNM of the LE 40 and SE 70 cells were computed through simulation. The SNM of the RV cell 10 was also computed to be used as a reference. Under nominal conditions, the SNM of the LE 40 and SE 70 cells were 0.246V and 0.221V, respectively, while the SNM of the RV cell 10 was 0.250V. Thus, the LE cell 40 and SE cell 70 show a decrease in SNM of 1.6% and 11.7%, respectively. One would expect that by using higher threshold voltage transistors in the design, the SNM of the cells would increase, but the asymmetry of the cells skews the lobes of the butterfly curve and decreases the SNM, as will be explained below.
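The percentage figures follow from a simple relative-change calculation, sketched here in Python with the nominal SNM values quoted above. The helper name is an assumption. Because the quoted voltages are rounded to three decimals, the computed percentages can differ in the last digit from those stated in the text.

```python
def percent_change(value: float, reference: float) -> float:
    """Signed change of `value` relative to `reference`, in percent."""
    return 100.0 * (value - reference) / reference

SNM_RV = 0.250  # volts, nominal, during a read
SNM = {"LE": 0.246, "SE": 0.221}

for cell, snm in SNM.items():
    print(f"{cell}: {percent_change(snm, SNM_RV):+.1f}%")
```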

First, let us examine the SNM of the cells when the wordline is not active. During this state, the SRAM cell is not as vulnerable as when it is being read, but a study of this case helps to understand the decrease in the SNM when the cell is being read. When the wordline is off, the only transistors that affect the SNM are the four transistors comprising the back-to-back inverters.

Since the four internal transistors of the LE cell 40 are all high Vt, the cell has equal low and high noise margins of 0.685V, a 22.6% increase over the standby SNM of the RV cell, 0.559V. However, when the SNM of the cell is being measured during a read the cell has high SNM in one state, 0.363V, and low SNM in the other, 0.246V. The asymmetry in the LE butterfly curve is due to the mismatch between the strength of the pass-gate (N3) and pull-down (N2) transistors. During a read, the N3 pass transistor 16, due to it being low Vt, has a higher conductivity than N2 and raises the voltage at the storage node to a higher voltage than if the two NMOS were of equal strength.

For the SE cell 70, the two internal inverters are different. Thus the standby (i.e., with the wordline off) SNM of the cell has asymmetric lobes with noise margins of 0.535V and 0.727V, in the worst case a 4.2% decrease in noise margin compared to the RV cell. The source of this mismatch is the Vt difference between N1 and N2, which causes one of the transfer characteristics to commence its transition in the SNM plot from ‘0’ to ‘1’ later than normal. During a read, the mismatch between the size of the lobes becomes exaggerated because it is as if a constant is subtracted from the noise margin on each side of the cell since each side of the cell has equal strength pass transistors and pull-down transistors. While being read, the SE cell 70 has low and high noise margins of 0.222V and 0.365V respectively.

The asymmetric cells' stability performance degrades compared to that of the RV cell. Since process variations induce an asymmetry in the butterfly curve, the original asymmetry inherent in the butterfly curves for the LE 40 and SE 70 cells allows one lobe of the butterfly curve to become pinched off even further and lose stability. For the LE cell 40 the butterfly curve becomes pinched off when N3 becomes stronger than N2 and P1 increases in strength, while N1 does not. The worst case for the SE cell 70 occurs at a different process corner. The butterfly curve becomes pinched off when P2 decreases in strength and N2 increases in strength, and N4 gets stronger than N1.

Monte-Carlo analysis was also performed on the RV 10, LE 40 and SE 70 cells. The Normal Scores Method reveals that the distributions for all cells were Gaussian. Due to their very small standard deviation, the SNM of all cells remains very close to their respective mean. Thus the mean of the SNM becomes a very important measure, and is a better reflection of the stability than the nominal or worst-case SNM. Using the mean as a measure of stability, the LE cell 40 has a 7% increase in SNM and the SE cell 70 has a 5.8% decrease.

Using the SNM as a measure of stability showed that the LE cell 40 was comparable to the RV cell 10 while the SE cell 70 showed a marginal decrease in stability. When I_(trip)/I_(read) is computed by simulation, it is seen that the SE cell 70 outperforms the RV cell 10 and the LE cell 40 suffers.

The LE cell 40 has a lower I_(trip)/I_(read) value due to the Vt mismatch between the pass transistor and pull-down transistor on one side of the cell. The I_(trip) values from both sides of the cell show a drop compared to the I_(trip) value from the RV cell 10 due to both pull-down transistors becoming high Vt. However, with N3 16 remaining low Vt, I_(read) on the fast side of the cell does not suffer the same drop, and I_(trip)/I_(read) falls compared to that of the RV cell 10.

The SE cell 70, due to it having the same strength pull-down and pass transistors 16, 18 on each side of the cell, does not experience the same problem as the LE cell 40. On the slow side of the cell, both I_(trip) and I_(read) fall compared to the RV cell 10, but I_(read) falls by a larger amount thus increasing the I_(trip)/I_(read). On the fast side of the cell, I_(read) does not change compared to the RV cell 10, but I_(trip) increases slightly. In the RV cell 10, the reduction in voltage (due to leakage) at the stored ‘1’ node degrades the current sinking capacity of the pull-down NMOS. In the SE cell 70, because of the high Vt transistors on the ‘1’ side of the cell there is no degradation in the current sinking capacity of the pull-down transistor and thus I_(trip) increases leading to a larger I_(trip)/I_(read).

A total of 59,049 different corner cases of process variations were simulated and the worst case I_(trip)/I_(read) was noted in each cell. The LE cell 40 and the RV cell 10 achieve their worst-case I_(trip)/I_(read) for the same process corner: when the difference in strength between N2 and N3 is amplified with N2 becoming weaker, and N3 16 becoming stronger. The SE cell 70, however, suffers its worst-case I_(trip)/I_(read) when N4 18 becomes stronger than N1.

Monte-Carlo analysis shows that I_(trip)/I_(read) is also Gaussian, based on the linear plots obtained from the Normal Scores Method. The standard deviation is very small and most cells will be very near the mean, where the LE cell 40 shows a 4.35% decrease and the SE cell 70 shows a 14.84% increase in I_(trip)/I_(read).

The SE 70 and LE 40 cells each have lower stability in either the SNM test or the I_(trip)/I_(read) test. In many cases, the stability of the cell is a critical factor to obtain a desired yield and to lower the cost of the chip. In that regard, two derivative cells, one from the LE cell 40 and one from the SE cell 70, have been developed that improve upon their SNM, but do not decrease the leakage as much as the SE 70 and LE 40 cells. The two new cells are named Stability-Leakage Enhanced (SLE) 90 and Stability-Speed Enhanced (SSE) 100 and are illustrated in FIGS. 9 and 10 respectively.

One way to improve the SNM of the cells under process variations is to try to make the size of the lobes of the butterfly curve symmetric. For the LE cell 40 the lobes can be made more symmetric by making N2 low Vt, but this new cell would just be the SE cell 70. Another option is to make P1 low Vt. This change, shown in FIG. 9, makes the lobes of the butterfly curve more symmetric. The SNMs are now 0.360V and 0.283V instead of 0.363V and 0.246V. To make the SE cell's 70 SNM plot more symmetric, P2 can be made low Vt yielding SNMs of 0.256V and 0.362V instead of 0.222V and 0.366V.

For these stability improved cells, all the previous tests for leakage, performance, and stability can be performed to compare them to the cells they were derived from, as well as to the RV cell 10.

The leakage performance of the stability improved SLE 90 and SSE 100 cells falls off, as expected due to one transistor in the LE 40 and SE 70 cells being re-converted to a low Vt transistor. For the SLE cell 90, the leakage reduction when holding a ‘1’ remains unchanged at a 6.96× reduction relative to RV cell 10, but the leakage reduction when holding a ‘0’ changes from 69.5× to 2.5×. For the SSE cell 100, when it is holding a ‘0’ the leakage reduction stays at 2.04×, but when it is holding a ‘1’ the leakage reduction changes from 6.96× to 1.91×.
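To see how these per-state figures combine with the bias in stored values, the per-state reductions can be weighted by the share of cells holding each value. The Python sketch below is an illustrative model, not a result from this description: it considers cell leakage only, and the 75% share of zero bits is a hypothetical input. The reduction factors themselves are the ones quoted above.

```python
# Leakage reduction factors relative to the RV cell, per stored value,
# as quoted in the text: (factor when holding '0', factor when holding '1').
REDUCTION = {
    "LE":  (69.5, 6.96),
    "SLE": (2.5, 6.96),
    "SE":  (2.04, 6.96),
    "SSE": (2.04, 1.91),
}

def effective_reduction(cell: str, p_zero: float) -> float:
    """Average leakage reduction when a fraction `p_zero` of cells hold '0'.

    Leakage adds, so the per-state relative leakages (1/factor) are
    averaged and the result is inverted.
    """
    if not 0.0 <= p_zero <= 1.0:
        raise ValueError("p_zero must be in [0, 1]")
    r0, r1 = REDUCTION[cell]
    relative_leakage = p_zero / r0 + (1.0 - p_zero) / r1
    return 1.0 / relative_leakage

P_ZERO = 0.75  # hypothetical share of zero bits, not a figure from the text
for cell in REDUCTION:
    print(f"{cell}: {effective_reduction(cell, P_ZERO):.2f}x")
```

For the SE-derived cells the factors are used exactly as quoted; flipping the bitlines, as noted earlier for the SE cell 70, would swap the two entries of a pair.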

Since the PMOS transistors do not play a large role in discharging the bitlines, the discharge times for the stability improved cells would be expected to be very close to those of the cells they were derived from. Through simulation, it is seen that the discharge times along BL and BLB remain almost constant. As for the write times, the SLE cell's 90 write time is 33.15% longer than the RV cell's 10 write time, down from the LE cell's 40 35.95% increase. The SSE cell's 100 write time jumps to a 49.22% increase over the RV cell's 10 write time.

A stability analysis has also been performed on the derivative cells for both the SNM and I_(trip)/I_(read). Both derivative cells perform better than the RV cell 10 in the worst case, and under Monte-Carlo analysis. Under the I_(trip)/I_(read) method, there is very little change, because I_(trip)/I_(read) depends strongly on the NMOS transistors, which have not been changed, but the stability-improved cells perform slightly worse than the cells from which they were derived.

It has been shown that when stability is recovered through a change in threshold voltage of the PMOS transistors, a large portion of the leakage benefits of the asymmetric cells is lost. Furthermore, the I_(trip)/I_(read) of the LE cell 40 could not be improved by threshold voltage assignment. Another way of improving stability is to resize some of the transistors to reclaim the conductance lost due to the high Vt assignment. This change does not have a large effect on the leakage characteristics because leakage increases exponentially with reduced threshold voltages, but increases only linearly with transistor size. Moreover, the low I_(trip)/I_(read) of the LE cell 40 can be improved by transistor resizing.

The lobes of the SNM plot for the SE cell 70 can be made more symmetric by making N1 wider. In our case, we increased the width of this transistor by 26%, leading to a new cell shown in FIG. 11 and referred to as Resized Speed Enhanced (RSE) 110. The SNM for the RSE cell 110 is comparable to that of the RV cell 10 and the change in N1's size leads to an increase of only 2.9% in cell area. The SNM margins are now 0.253V and 0.347V instead of 0.222V and 0.366V. The RSE cell's 110 nominal value for I_(trip)/I_(read) does not change much compared to the nominal value for the SE cell 70. On the slow side of the cell, which had the higher I_(trip)/I_(read) value for the SE cell 70, the increase in N1's size allows for I_(trip) to become larger and increases the I_(trip)/I_(read) value. The fast side of the cell, however, which has the limiting I_(trip)/I_(read) value, has a reduced I_(trip) that reduces the final value of I_(trip)/I_(read) to 2.53. The reduction in I_(trip) is due to the ‘1’ storage node having a slightly lower voltage due to the increased leakage through N1. Nevertheless, the RSE cell's 110 I_(trip)/I_(read) value is still 11.8% better than that of the RV cell 10.

For the LE cell 40, increasing the width of N2 allows the conductance of N2 to approach that of N3 16, which leads to an increase in I_(trip), thus increasing I_(trip)/I_(read). By increasing N2's width by 22% (leading to only a 2.4% increase in cell area), the I_(trip)/I_(read) value of the new Resized Leakage Enhanced (RLE) cell 120 (FIG. 12) was made to be 2.28, which is comparable to the I_(trip)/I_(read) value of 2.26 of the RV cell 10. The increase in N2's width also increases the SNM of the RLE cell 120, where the margins are now 0.349V and 0.280V instead of 0.363V and 0.246V.
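The nominal SNM pairs quoted above for the asymmetric cells and their derivatives can be compared side by side. The following is a minimal sketch only: the names `SNM_LOBES`, `limiting_snm`, and `lobe_imbalance` are illustrative and do not appear in the original, and treating the smaller of the two quoted values as the limiting margin of a cell is an assumption made for this comparison.

```python
# Nominal SNM values (in volts) for the two lobes of each cell's butterfly
# curve, as quoted in the text. The dictionary and helper names are
# hypothetical and used only for illustration.
SNM_LOBES = {
    "LE": (0.363, 0.246),
    "SLE": (0.360, 0.283),
    "RLE": (0.349, 0.280),
    "SE": (0.222, 0.366),
    "SSE": (0.256, 0.362),
    "RSE": (0.253, 0.347),
}


def limiting_snm(lobes):
    """Return the smaller lobe value, assumed here to limit cell stability."""
    return min(lobes)


def lobe_imbalance(lobes):
    """Return the absolute difference between the two lobe values."""
    return abs(lobes[0] - lobes[1])


if __name__ == "__main__":
    for name, lobes in SNM_LOBES.items():
        print(f"{name}: limiting SNM = {limiting_snm(lobes):.3f} V, "
              f"imbalance = {lobe_imbalance(lobes):.3f} V")
```

Under this reading, both the threshold-voltage derivatives (SLE, SSE) and the resized derivatives (RLE, RSE) raise the smaller lobe and narrow the gap between lobes relative to the LE and SE cells they were derived from.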

As expected, the leakage performance of the resized cells is better than that of the SLE 90 and SSE 100 cells. For the RLE cell 120 the leakage reduction when holding a ‘1’ remains unchanged at a 6.96× reduction relative to RV cell 10, but the leakage reduction when holding a ‘0’ only slightly reduces from 69.5× to 57.9×. The SLE cell's 90 leakage reduction when holding a ‘0’ was only 2.5×. When the RSE cell 110 is holding a ‘0’ the leakage reduction stays at 2.04× relative to RV cell 10, and when it is holding a ‘1’ the leakage reduction only changes from 6.96× to 6.79×. This change is also minimal when compared to the SSE cell's 100 leakage reduction of 1.91×.

Due to the increased size of the pull-down NMOS transistors, the resized cells have the potential of improving the read-access time of the cell. For the RLE cell 120 the discharge time along BLB remains at a 61.1% increase over the RV cell's 10 BLB discharge time, but the BL discharge time is now only 3.7% longer than the RV cell's 10 discharge time. As noted previously, only the BL discharge time is important due to the timed read based on a new sense amplifier. For the RSE cell 110, the discharge time along the fast side of the cell, BL, does not change, but the discharge time along BLB is reduced from the SE cell's 70 61.7% increase over RV cell 10 to a 49.2% increase over RV cell 10. This extra performance along BLB plays no important role in the cell's performance. As for the write times, the RLE cell's 120 write time rises to a 39% increase over the RV cell's 10 write time, from the LE cell's 40 35.95% increase. The RSE cell's 110 write time jumps to a 45% increase over the RV cell's 10 write time.

The stability analysis has also been performed on the resized cells for both the SNM test and I_(trip)/I_(read) test. Both resized cells perform better than the RV cell in the worst case, and under Monte-Carlo analysis for the SNM. Under the I_(trip)/I_(read) test, the RLE cell 120 now performs better than RV cell 10 both in the worst case and on average. The increase in N1's size accomplishes the higher I_(trip)/I_(read). The RSE cell's 110 I_(trip)/I_(read) value also increases slightly under all tests, even surpassing the SE cell's 70 I_(trip)/I_(read) value in the worst case. With a larger pull-down transistor, the process variations do not have as much of an effect on the RSE cell's 110 stability.

Another figure of merit for the different cells is their stability under different supply voltages. For the technology being used, the nominal supply voltage is 1.2V. Monte-Carlo analysis has been performed for the RV 10, LE 40, SLE 90, RLE 120, SE 70, SSE 100 and RSE 110 cells for supply voltages ranging from 0.75V to 1.6V.

For voltages above 1.2V, LE 40, SLE 90 and RLE 120 improve their SNM advantage over the RV cell 10. With a higher VGS, the difference in conductance between the pass-gate (N3) and pull-down (N2) transistors, which was the root cause of the low stability at 1.2V, diminishes. At higher voltages, the SNM of the SE 70 and SSE 100 cells starts to diminish just as the SNM of the RV cell 10 does, but at a lower rate. The SNM of the RSE cell 110 levels off at higher voltages.

With lower supply voltages, the SNM of the asymmetric cells starts to suffer. For the LE 40, SLE 90 and RLE 120 cells, the SNM decreases rapidly, but the SLE cell's 90 SNM remains comparable to that of RV cell 10, while the RLE cell's 120 SNM becomes comparable to that of the LE cell 40. This decrease in stability is caused by the difference in conductance between regular voltage and higher voltage transistors at low VGS. Furthermore, at low VGS, the extra conductance of the larger transistor in the RLE cell 120 does not have a large effect since the transistor is not fully on. The SNM of SE 70, SSE 100 and RSE 110 also decreases, but not as fast as that of the LE cell 40. Again, this decrease in SNM is due to the difference in conductance at low VGS.

The same tests were performed for the I_(trip)/I_(read) method with the result that the curves for all cells are much better behaved. The SE 70 and SSE 100 cells have a near 24% advantage over the RV cell at 0.75V and an 8% advantage at 1.65V. The LE 40 and SLE 90 cells have approximately a 16% decrease in I_(trip)/I_(read) at 0.75V and are comparable at 1.65V to the RV cell 10. The resized cells behave slightly differently, with the RSE cell 110 having an 11.7% improvement at 1.65V and a 32.2% improvement at 0.75V. The RLE cell 120 has a 9.6% improvement at 1.65V and a 4% decrease at 0.75V.

A conventional sense amplifier 130 is shown in FIG. 13. It is not suitable for the present invention due to the slow access time when the cell is storing a ‘0’. To obtain fast read times regardless of the data value, a new sense amplifier 140 has been designed and is shown in FIG. 14. Compared to the conventional sense amplifier 130, the new sense amplifier 140 has four additional transistors 142, 144, 146, 148 and an area increase of roughly 0.229 μm² or 14.4%.

In addition to BL 132 and BLB 134, the sense amplifier 140 has two new inputs, D 150 and DB 152. These are connected to a dummy column of cells that store ‘1’ at all times, but which are otherwise exactly identical to all other cells in the array. This dummy column extends the full length of the SRAM array such that during every read operation, one of the dummy cells will have its wordline asserted. Since the dummy cells always store a ‘1’, they are always fast on the discharge (as fast as the fast side of any other cell), and they are used to provide something like a timer signal. This is achieved by connecting the dummy bitlines 150, 152 to the sense amplifier 140 in a reverse way. D 150 is connected to the right side, where BLB 134 is connected, and DB 152 is connected to the left side, where BL 132 is connected. This enables D 150 and DB 152 to trigger a fast read of a ‘0’ result when the cell being read has a ‘0’ content.

Sensing a ‘1’ is as fast as with a conventional sense amplifier 130 since this is done by sensing a discharge of BLB 134 due to the action of the fast side of the cell. Sensing a ‘0’ is initiated at a later time than it would be in a conventional sense amplifier 130 to allow sufficient time for the fast side to trigger the sense amplifier 140 if it has to do so. While initiating the sensing for a ‘0’ is delayed, the combined effect of the dummy cell and the slow side of the asymmetric cell makes the sensing process itself much faster once initiated, so that the end result becomes available at about the same time as it would when sensing a ‘1’.

The detailed operation of the sense amplifier 140 is as follows. Initially, the bitlines 132, 134 are precharged and all four amplifier inputs rise to V_(DD). During this phase the sense amplifier 140 is being reset and nodes A and B are reset to an intermediate value. During a read operation, either BLB 134 will discharge (cell has a ‘1’, fast discharge from the fast side) or BL 132 will discharge (cell has a ‘0’, slow discharge from the slow side). Furthermore, the signal DB 152, which is on the fast side of the dummy cell, will be discharged since the dummy cells permanently hold a logic ‘1’. If BLB 134 is being discharged, a logic ‘1’ is being sensed and the differential pair comprised of N1 and N2 causes increased current to pass through the left branch, thus increasing the voltage at node B and decreasing the voltage at node A. Through the positive feedback loop of P1, P2, N5, and N6, the rate of change for nodes A and B is increased to achieve quick sensing. When BL 132 is being discharged, a logic ‘0’ is being sensed. It does so at a slower rate since it is being discharged from the slow side of the asymmetric cell. To achieve fast sensing in this case, the dummy bitlines 150, 152, which are connected to the differential pair of N3 and N4, initiate the sensing of a logic ‘0’. Through the combined effect of DB 152 and BL 132 being discharged, albeit at a slower rate, approximately symmetric sense times are achieved.

For this sensing scheme to achieve reliable results it must allow for adequate time for BLB 134 to discharge before initiating a logic ‘0’ read. This safety factor is achieved in two ways. First, the dummy bitlines 150, 152 are connected to all sense amplifiers and therefore have a slightly higher capacitive load compared to real bitlines 132, 134, leading to a slower discharge on DB 152 compared to BLB 134. The extra capacitive loading does not slow the sense time when BL 132 is discharging because of the concerted effort between BL 132 and DB 152 to sense the same value. Second, the transistors connected to the bitlines 132, 134 are wider than the transistors connected to the dummy bitlines 150, 152, leading to a higher transconductance and higher gain from the bitlines 132, 134 to the output than from the dummy bitlines 150, 152.
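The role of the two safety factors can be illustrated with a toy behavioral model. This is only a sketch under loudly stated assumptions, not a model of the circuit of FIG. 14: bitline discharge is taken to be linear in time, the amplifier is reduced to a comparison of gain-weighted voltage drops on its two sides, and every rate, gain, and name (`discharge`, `sense`, `fast_rate`, `slow_rate`, `dummy_rate`, `g_bitline`, `g_dummy`) is hypothetical.

```python
def discharge(rate_v_per_ns, t_ns, vdd=1.2):
    """Voltage drop of a precharged line after t_ns (linear toy model)."""
    return min(vdd, rate_v_per_ns * t_ns)


def sense(stored_bit, t_ns, fast_rate=1.0, slow_rate=0.6,
          dummy_rate=0.9, g_bitline=1.0, g_dummy=0.7):
    """Return the bit a toy amplifier would resolve at time t_ns.

    All numeric defaults are assumed values chosen for illustration.
    dummy_rate < fast_rate stands in for the extra capacitive load on the
    dummy bitlines; g_dummy < g_bitline stands in for the narrower
    transistors attached to them.
    """
    if stored_bit == 1:
        # Cell holds '1': BLB discharges quickly from the fast side.
        drop_blb, drop_bl = discharge(fast_rate, t_ns), 0.0
    else:
        # Cell holds '0': BL discharges slowly from the slow side.
        drop_blb, drop_bl = 0.0, discharge(slow_rate, t_ns)

    # The dummy cell always holds '1', so DB discharges and D stays high.
    drop_db, drop_d = discharge(dummy_rate, t_ns), 0.0

    # D shares a side with BLB; DB shares a side with BL (reversed hookup).
    blb_side = g_bitline * drop_blb + g_dummy * drop_d
    bl_side = g_bitline * drop_bl + g_dummy * drop_db
    return 1 if blb_side > bl_side else 0


if __name__ == "__main__":
    print(sense(1, 0.5), sense(0, 0.5))
    # Removing both safety factors lets the dummy overpower a '1' read.
    print(sense(1, 0.5, dummy_rate=1.2, g_dummy=1.0))
```

In this simplified picture, a stored ‘1’ is resolved correctly only while the weighted BLB drop exceeds the weighted DB drop, which is what the slower dummy discharge and the lower dummy-side gain are described as ensuring.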

To limit the sense power, the sense amplifiers are clocked. The sense clock turns on the amplifiers and sets them up in their high gain region before the sensing occurs. To improve yield and ensure low-power operation, the clock path is matched to the data path. Matching is achieved by using an extra set of dummy bitlines to match the bitline delay and clock the sense amplifiers at the appropriate time.

Using the above cells and the sense amplifier 140 presented above, a 32-Kbyte SRAM example was designed and simulated to measure leakage, and read and write times. Each of the 128 SRAM sub-arrays contains 64 cells along each bitline, and 32 cells along each wordline. The SRAM was simulated at a temperature of 110° C. with the RV cell 10, basic asymmetric, LE 40, SLE 90, RLE 120, SE 70, SSE 100, RSE 110 and HV 25 cells. Furthermore, the RV 10 and HV cells 25 were simulated with a conventional sense amplifier 130, and these results were used as a reference for our design.
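The stated organization can be checked against the stated capacity with simple arithmetic. The sketch below assumes one bit per cell and 1 Kbyte = 1024 bytes; the constant names are illustrative and not from the original.

```python
# Array organization as described in the text.
SUB_ARRAYS = 128
CELLS_PER_BITLINE = 64   # rows in each sub-array
CELLS_PER_WORDLINE = 32  # columns in each sub-array

# Assumption: each cell stores one bit, and 1 Kbyte is 1024 bytes.
total_bits = SUB_ARRAYS * CELLS_PER_BITLINE * CELLS_PER_WORDLINE
total_bytes = total_bits // 8
total_kbytes = total_bytes // 1024

if __name__ == "__main__":
    print(total_bits, total_bytes, total_kbytes)
```

With these figures the sub-arrays together hold 262,144 bits, which matches the 32-Kbyte capacity quoted for the example SRAM.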

The leakage trends seen above for the single cell remain true for the complete SRAM, where the LE 40 and SE 70 cells offer a reduction of 70× and 2× while storing a ‘0’ and a reduction of about 7× when storing a ‘1’. The stability improved cells and the resized cells also show the same leakage trends from the single cell experiments.

The total SRAM read access time includes four components: 1) input register propagation delay and hold times; 2) the address decoding delay; 3) the delay for wordline, bitline and sensing; and 4) the output register setup time. Only the delay for wordline, bitline and sensing is affected by the cell design. Specifically, this time is the time period from when precharging is complete to when the sense amplifier has reached 90% of its swing.

While the discharge times are asymmetric, the worst-case sensing times are on par with the RV cell with a conventional sense amplifier 130. Compared with the RV cell 10 with a conventional sense amplifier 130, the LE cell 40 is 10% slower. The effect on the total read time is an increase of just under 5%, however. The SE cell 70 is slightly faster, not because the sense amplifier 140 is quicker, but because the bitline discharge time for the SE cell 70 is 50 ps quicker than that of the RV cell 10, which is a by-product of the asymmetry of the SE cell 70. Furthermore, the RLE cell 120 has a worst-case sense time that is 2.5% slower than the RV cell 10, with the effect on total read time being near 1%. Interestingly, the HV cell 25 with a conventional sense amplifier 130 would be 26% slower.

An important side comment to be made is that the new sense amplifier 140 does not speed up the sensing for the RV 10 and HV 25 cells when compared to the sensing with the conventional sense amplifier. Indeed, the RV 10 and HV 25 cells with the new sense amplifier 140 have worst-case sense times that are 5% slower than the sense times with the conventional sense amplifier 130. Thus, in comparing the speed of the new cells with the new sense amplifier 140 to the conventional cells with the conventional sense amplifier 130, the comparison is fair and valid, because the new sense amplifier 140 on its own does not speed up the read access time of the conventional cells.

The LE 40 and SE 70 cells exhibit a write time increase of 19.4% and 25.3% respectively over the RV cell 10. The SLE 90 and SSE 100 cells exhibit an increase of 28.4% and 13.4% respectively, and the RLE 120 and RSE 110 exhibit an increase of 22.4% and 27.6% respectively. The increase in write times is of minor importance since the write times are all shorter than the read times of the associated cells and therefore the speed of the SRAM is dependent on the read time.

The present invention also analyzes two cache organizations that use asymmetric cell designs: statically biased and dynamic inversion. In the statically biased cache, the cells are simply replaced with asymmetric ones. This cache is statically biased to dissipate low leakage power only when it stores the preferred bit value ‘0’. What makes this cache successful is typical program behavior that exhibits a strong bias towards zero. Specifically, we observed that a level-1 data cache had an average of 78.7% zeros in the data stream, and a level-1 instruction cache had an average of 62.9% zeros. Given this, the statically biased cache with the SE cells reduces leakage by 4.5× and 3.8× for an instruction and a data cache, respectively, compared to conventional symmetric cell caches. The caches are 39 Kbyte 4-way set associative caches. While programs with a higher fraction of ‘1’s than ‘0’s may exist, our SRAM would still dissipate much lower leakage power compared to the regular Vt cell cache.
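How the fraction of stored zeros translates into an overall leakage figure can be sketched with a simple weighted model. This is an illustration under stated assumptions, not the simulation behind the cache results quoted above: it considers only the cell array (no decoders, sense amplifiers, or other peripheral circuits), normalizes the symmetric RV cell's leakage to 1 in both states, and uses the single-cell reduction factors given earlier for the leakage-enhanced cells. The names `REDUCTION` and `effective_reduction` are hypothetical.

```python
# Single-cell leakage reduction factors relative to the RV cell, as quoted
# earlier in the text, keyed by the stored bit value.
REDUCTION = {
    "LE": {"0": 69.5, "1": 6.96},
    "SLE": {"0": 2.5, "1": 6.96},
    "RLE": {"0": 57.9, "1": 6.96},
}


def effective_reduction(zero_fraction, reduction_zero, reduction_one):
    """Array-level leakage reduction for a given fraction of stored zeros.

    Assumes the reference cell leaks one unit in either state, so the
    asymmetric array leaks zero_fraction / reduction_zero plus
    (1 - zero_fraction) / reduction_one units per cell on average.
    """
    if not 0.0 <= zero_fraction <= 1.0:
        raise ValueError("zero_fraction must be between 0 and 1")
    relative_leakage = (zero_fraction / reduction_zero
                        + (1.0 - zero_fraction) / reduction_one)
    return 1.0 / relative_leakage


if __name__ == "__main__":
    # Zero fractions reported in the text for level-1 caches.
    for label, zeros in (("data", 0.787), ("instruction", 0.629)):
        for cell, factors in REDUCTION.items():
            gain = effective_reduction(zeros, factors["0"], factors["1"])
            print(f"{label} cache, {cell} cell: about {gain:.1f}x")
```

Because the per-state leakages are averaged before inverting, the less favorable state dominates: even a modest fraction of ones keeps the array-level figure well below the best single-cell reduction.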

In selective inversion, the values stored within a block can be inverted at a byte granularity (other granularities are possible). In this design, if a byte contains five or more ones it is inverted prior to storing it in the cache. This cache needs an additional inversion flag cell per byte that holds information on which bytes were inverted. Inversion happens at write time. Since stores are typically buffered in a write buffer and are only sent to the data cache on commit, there is plenty of time to decide and apply inversion if necessary. A logic flow diagram for this procedure is illustrated in FIG. 15.
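The byte-level rule described above can be sketched in software as follows. This is a minimal illustration, not a description of the hardware of FIG. 15; the function names `encode_byte` and `decode_byte` are hypothetical, and the inversion flag is modeled as a boolean returned alongside the stored byte.

```python
from typing import Tuple


def encode_byte(value: int) -> Tuple[int, bool]:
    """Apply selective inversion to one byte at write time.

    Returns the byte to store and the inversion flag. A byte with five or
    more ones is stored inverted so that zeros are in the majority.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    ones = bin(value).count("1")
    if ones >= 5:
        return value ^ 0xFF, True
    return value, False


def decode_byte(stored: int, inverted: bool) -> int:
    """Recover the original byte at read time using the inversion flag."""
    return stored ^ 0xFF if inverted else stored


if __name__ == "__main__":
    for original in (0x00, 0x0F, 0xF8, 0xFF):
        stored, flag = encode_byte(original)
        assert decode_byte(stored, flag) == original
        print(f"{original:#04x} -> stored {stored:#04x}, inverted={flag}")
```

With this rule every stored byte holds at most four ones, at the cost of one extra flag cell per byte.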

The present invention presents a novel approach that combines both circuit and architecture level techniques for drastically reducing leakage power dissipation. A key observation behind the present invention is that cache-resident memory values of ordinary programs exhibit a strong bias towards zero or one at the bit level. The present invention has introduced a family of high-speed asymmetric dual-Vt SRAM cell designs that exploit this bit-level bias to reduce leakage power while maintaining high performance.

Various asymmetric cells offer different performance/leakage/stability characteristics. The SE cell reduces leakage power by at least 2× and by 7× in the preferred state. It is as fast as the conventional, RV, SRAM cell. By comparison, the LE cell reduces leakage by at least 7× and by about 70× in the preferred state. Its total read time is only 5% higher than the SE and RV cells. These latter two cells have lower stability than LE under both the SNM and the I_(trip)/I_(read) tests. Four other cells that compensate for stability were also designed, two by choosing different combinations of threshold voltages for the cell transistors, and two by changing some transistor sizes. The SSE cell reduces leakage power by 1.9× and 2.3× in the preferred state with no performance degradation, and the SLE cell reduces leakage power by 2.3× and 7× in the preferred state with only a 5% increase in read access times. The SSE and SLE cells have comparable stability to the RV cell. The RLE cell reduces leakage by 58× in the preferred state and by 7× in the other state with only a 1% increase in read access time, and an area increase of about 2.4%. The RSE cell reduces leakage by about 7× in the preferred state, and 2× in the other state. It has no performance degradation, but has an area increase of about 2.9%. The RLE and RSE cells have comparable stability to the RV cell. By comparison, an all high Vt cell reduces leakage power by about 70× while its bitline discharge time is 60% slower than the SE and RV cells.

The present invention also presents two cache organizations that use either a static bias towards zero, or dynamic, selective inversion to maximize the number of cache bits that are zero. While the reduction possible with either technique depends on application behavior, the statically biased cache with the SE cells reduces leakage by 4.5× and 3.8× for an instruction and a data cache, respectively, as compared to conventional symmetric-cell caches.

The preceding description has focused on SRAM cell designs that were comprised of six transistors. The principles of the present invention were described and applied to a six transistor design for ease of illustration. It should be noted, however, that the same asymmetric principles of the present invention may also be applied to other SRAM cell designs including, but not limited to, those comprised of four transistors and two resistors.

It is the asymmetric nature of the present invention that provides the novelty and uniqueness rather than a particular SRAM architecture. Thus, SRAM cell designs, as well as sense amplifiers and SRAM devices comprised of arrays of SRAM cells, that exhibit asymmetric transistor design characteristics are considered within the scope of the present invention.

Specific embodiments of an invention are described herein. One of ordinary skill in the circuit design and computing arts will quickly recognize that the invention has other applications in other environments. In fact, many embodiments and implementations are possible. The following claims are in no way intended to limit the scope of the invention to the specific embodiments described above.

1. An asymmetric SRAM cell for storing a binary variable, the asymmetric SRAM cell having reduced leakage power with respect to a comparable symmetric SRAM cell when the asymmetric SRAM cell stores a binary variable representing a predetermined binary value, the asymmetric SRAM cell comprising: a plurality of transistors operably coupled and configured as an asymmetric SRAM cell, wherein the plurality of transistors include at least one first type of transistor and at least one second type of transistor that is weaker than the first type of transistor, such that the configuration of the asymmetric SRAM cell achieves reduced leakage power with respect to a symmetric SRAM cell having the first type of transistor only.
2. The asymmetric SRAM cell of claim 1 wherein at least one of the second type of transistor is selected from among the group consisting of: a transistor having a higher voltage threshold (V_(t)) as compared to the voltage threshold (V_(t)) of the first type of transistor; a transistor having a decreased channel width as compared to the channel width of the first type of transistor; and a transistor having an increased channel length as compared to the channel length of the first type of transistor.
3. A sense amplifier for coupling with an asymmetric SRAM cell that provides faster access times when the asymmetric SRAM cell stores a first predetermined binary value, said sense amplifier comprised of: a first pair of cross coupled inverters across a bitline (BL) and a bitline bar (BLB); a second pair of cross coupled inverters operably coupled with the first pair of cross coupled inverters; a plurality of additional transistors forming a dummy column of cells that store a second predetermined binary value at all times wherein during a read operation of the SRAM cell one of the dummy cells will have its wordline asserted, said dummy column of cells operably coupled with the first pair of cross coupled inverters; and four inputs operably coupled with a subset of transistors of the sense amplifier wherein the inputs include the BL, the BLB that derive from the SRAM cell, a dummy bit line (D), and a dummy bitline bar (DB) that are input to the dummy cells such that D is input to the sense amplifier on the same side as BLB while DB is input to the sense amplifier on the same side as BL.
4. The sense amplifier of claim 3 wherein at least one of the transistors coupled with BL and BLB have higher transconductance characteristics than at least one of the transistors coupled with D and DB.
5. The sense amplifier of claim 3 wherein at least one of the transistors coupled with BL and BLB are selected from among the group consisting of: transistors having a lower voltage threshold (V_(t)) as compared to the voltage threshold (V_(t)) of the transistors coupled with D and DB; transistors having an increased channel width as compared to the channel width of the transistors coupled with D and DB; and transistors having a decreased channel length as compared to the channel length of the transistors coupled with D and DB.
6. An SRAM device comprising: an array of SRAM cells wherein each SRAM cell stores a binary variable representing a predetermined binary value, and each SRAM cell is an asymmetric SRAM cell having reduced leakage power with respect to a comparable symmetric SRAM cell, each asymmetric SRAM cell comprising: a plurality of transistors operably coupled and configured as an asymmetric SRAM cell, wherein the plurality of transistors include at least one of a first type of transistor and at least one of a second type of transistor that is weaker than the first type of transistor, such that the configuration of each asymmetric SRAM cell achieves reduced leakage power with respect to a symmetric SRAM cell having the first type of transistor only.
7. The SRAM device of claim 6 wherein the array of SRAM cells in the SRAM device comprises an SRAM device selected from the group consisting of a direct store SRAM device and a selectively inverted SRAM device.
8. The SRAM device of claim 6 wherein the array of SRAM cells in the SRAM device comprises a cache memory selected from the group consisting of a direct store cache memory and a selectively inverted cache memory.
9. A combination SRAM device and sense amplifier comprising: an array of SRAM cells wherein each SRAM cell stores a binary variable representing a predetermined binary value, and wherein each SRAM cell is an asymmetric SRAM cell having reduced leakage power with respect to a comparable symmetric SRAM cell, each asymmetric SRAM cell comprising: a plurality of transistors operably coupled and configured as an asymmetric SRAM cell, wherein the plurality of transistors include at least one of a first type of transistor and at least one of a second type of transistor that is weaker than the first type of transistor, such that the configuration of each asymmetric SRAM cell achieves reduced leakage power with respect to a symmetric SRAM cell having the first type of transistor only; and at least one sense amplifier comprised of: a first pair of cross coupled inverters across a bitline (BL) and a bitline bar (BLB); a second pair of cross coupled inverters operably coupled with the first pair of cross coupled inverters; a plurality of additional sense amplifier transistors forming a dummy column of cells that store a second predetermined binary value at all times wherein during a read operation of the SRAM cell one of the dummy cells will have its wordline asserted, said dummy column of cells operably coupled with the first pair of cross-coupled inverters; and four inputs operably coupled with a subset of the sense amplifier transistors wherein the inputs include the BL, the BLB that derive from the SRAM cell, a dummy bit line (D), and a dummy bitline bar (DB) that are input to the dummy cells such that D is input to the sense amplifier on the same side as BLB while DB is input to the sense amplifier on the same side as BL.
10. The combination SRAM device and sense amplifier of claim 9 wherein the sense amplifier transistors coupled with BL and BLB have higher transconductance characteristics than the sense amplifier transistors coupled with D and DB.
11. The combination SRAM device and sense amplifier of claim 9 wherein at least one of the sense amplifier transistors coupled with BL and BLB are selected from among the group consisting of: transistors having a lower voltage threshold (V_(t)) as compared to the voltage threshold (V_(t)) of the transistors coupled with D and DB; transistors having an increased channel width as compared to the channel width of the transistors coupled with D and DB; and transistors having a decreased channel length as compared to the channel length of the transistors coupled with D and DB.
12. The combination SRAM device and sense amplifier of claim 9 wherein the SRAM device comprises an SRAM device selected from the group consisting of a direct store SRAM device and a selectively inverted SRAM device.
13. The combination SRAM device and sense amplifier of claim 12 wherein the array of SRAM cells in the SRAM device comprises a cache memory selected from the group consisting of a direct store cache memory and a selectively inverted cache memory.